Class-based Word Sense Induction for dot-type nominals

Authors

Lauren Romeo

Héctor Martínez Alonso

Núria Bel

Abstract

This paper describes an effort to capture the sense alternation of dot-type nominals using Word Sense Induction (WSI). We propose that dot-type nominals generate more semantically consistent groupings when clustered into more than two clusters, which accounts for literal, metonymic and underspecified senses. Using a class-based approach, we replace individual lemmas with a placeholder representing the entire dot type, which also compensates for data sparsity. Although the distributional evidence does not motivate an individual cluster for each sense, we discuss how our results empirically support theoretical proposals regarding dot types.
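
The class-based replacement step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-type labels, the lemma lists, and the function name `replace_with_placeholder` are all hypothetical and chosen only for illustration. The idea shown is that pooling every member lemma under one placeholder token lets contexts from sparse lemmas contribute to a single, larger set of instances before clustering.

```python
# Hypothetical dot types and member lemmas (illustrative assumptions only;
# the lemma lists used in the paper are not given on this page).
DOT_TYPE_LEMMAS = {
    "ANIMAL_MEAT": {"chicken", "lamb", "salmon"},
    "LOCATION_ORGANIZATION": {"england", "china", "germany"},
}

# Invert the mapping so each lemma points to its class placeholder.
LEMMA_TO_PLACEHOLDER = {
    lemma: dot_type
    for dot_type, lemmas in DOT_TYPE_LEMMAS.items()
    for lemma in lemmas
}


def replace_with_placeholder(lemmas):
    """Replace each dot-type lemma with its class placeholder.

    Lemmas that belong to no dot type are returned unchanged.
    """
    return [LEMMA_TO_PLACEHOLDER.get(lemma.lower(), lemma) for lemma in lemmas]


# Two lemmatized contexts with different target lemmas of the same dot type.
contexts = [
    ["the", "chicken", "cross", "the", "road"],
    ["we", "eat", "lamb", "for", "dinner"],
]

# After replacement, both contexts are instances of one placeholder token.
pooled = [replace_with_placeholder(context) for context in contexts]
```

In this sketch, the pooled contexts would then be passed to whatever WSI clustering step is used; that step is outside the scope of the example.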

Similar resources

Detecting Compositionality in Multi-Word Expressions

Identifying whether a multi-word expression (MWE) is compositional or not is important for numerous NLP applications. Sense induction can partition the context of MWEs into semantic uses and therefore aid in deciding compositionality. We propose an unsupervised system to explore this hypothesis on compound nominals, proper names and adjective-noun constructions, and evaluate the contribution of...

Detecting selectional behavior of complex types in text

In this paper, we discuss some aspects of selectional behavior of dot objects, and present an algorithm for clustering selector contexts for dot nominals according to the selected type. The clustering algorithm is based on the notion of contextualized similarity between selector contexts and defines a similarity measure for contextual equivalents of the target nominal.

MaxMax: A Graph-Based Soft Clustering Algorithm Applied to Word Sense Induction

This paper introduces a linear-time graph-based soft clustering algorithm. The algorithm applies a simple idea: given a graph, vertex pairs are assigned to the same cluster if either vertex has maximal affinity to the other. Clusters of varying size, shape, and density are found automatically, making the algorithm suited to tasks such as Word Sense Induction (WSI), where the number of classes is un...

Word Sense Induction and Disambiguation Rivaling Supervised Methods

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words...

Structured Generative Models of Continuous Features for Word Sense Induction

We propose a structured generative latent variable model that integrates information from multiple contextual representations for Word Sense Induction. Our approach jointly models global lexical, local lexical and dependency syntactic context. Each context type is associated with a latent variable and the three types of variables share a hierarchical structure. We use skip-gram based word and d...

Publication date: 2013
